
    The weight of phonetic substance in the structure of sound inventories

    In the research field initiated by Liljencrants & Lindblom in 1972, we illustrate the possibility of giving substance to phonology, that is, of predicting the structure of phonological systems from non-phonological principles, be they listener-oriented (perceptual contrast and stability) or speaker-oriented (articulatory contrast and economy). For vowel systems, we proposed the Dispersion-Focalisation Theory (DFT; Schwartz et al., 1997b). The DFT predicts vowel systems from two competing perceptual constraints weighted by two parameters, λ and α: the first aims at increasing auditory distances between vowel spectra (dispersion), the second at increasing the perceptual salience of each spectrum through formant proximities (focalisation). We also introduced new variants based on concepts from physics, namely the phase space (λ, α) and the polymorphism of a given phase, or superstructures in phonological organisations (Vallée et al., 1999), which allow us to generate 85.6% of the 342 UPSID systems with 3 to 7 vowel qualities. No comparable theory for consonants seems to exist yet. We therefore present a detailed typology of consonants and suggest ways to explain the predominance of plosives over fricatives and of voiceless over voiced consonants by (i) comparing them with language acquisition data at the babbling stage, and examining the capacity to acquire rather different linguistic systems in relation to the main degrees of freedom of the articulators; and (ii) showing that the places "preferred" for each manner are at least partly conditioned by the morphological constraints that facilitate or complicate, enable or preclude, the required articulatory gestures, e.g. the complexity of the articulatory control for voicing and the aerodynamics of fricatives. A rather strict coordination between the glottis and the oral constriction is needed to produce acceptable voiced fricatives (Mawass et al., 2000). We determine that the region where the combinations of Ag (glottal area) and Ac (constriction area) values result in a balance between the voice and noise components is indeed very narrow. We thus demonstrate that some of the main tendencies in the phonological vowel and consonant structures of the world's languages can be explained, at least in part, by sensorimotor constraints, and argue that phonology can actually take part in a theory of Perception-for-Action-Control.
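    The competition between dispersion and focalisation can be sketched as a toy energy function over a vowel system. This is an illustrative reconstruction, not the actual DFT implementation: the perceptual distance metric, the focalisation term, and the example coordinates below are simplified stand-ins for the formulation of Schwartz et al. (1997b).

```python
import itertools

# Vowels are points in a schematic (F1, F2') perceptual space.
# lam and alpha are the weighting parameters named in the abstract.

def dispersion_energy(vowels):
    """Sum of inverse squared distances between all vowel pairs
    (lower when vowels are spread apart in the space)."""
    e = 0.0
    for (f1a, f2a), (f1b, f2b) in itertools.combinations(vowels, 2):
        d2 = (f1a - f1b) ** 2 + (f2a - f2b) ** 2
        e += 1.0 / d2
    return e

def focalisation_energy(vowels):
    """Negative term rewarding formant proximity within a vowel
    (F2 close to F1 here, as a stand-in for true formant convergence)."""
    return -sum(1.0 / ((f2 - f1) ** 2) for f1, f2 in vowels)

def dft_energy(vowels, lam=1.0, alpha=0.3):
    # Lower energy = better system: dispersed across vowels, focal within each.
    return lam * dispersion_energy(vowels) + alpha * focalisation_energy(vowels)

# A 3-vowel system /i a u/ in illustrative Bark-like (F1, F2') coordinates:
system = [(2.5, 14.0), (7.0, 9.0), (3.0, 6.0)]
print(dft_energy(system))
```

    Predicted systems are then those minimizing this energy for given (λ, α); varying the two weights traces out the phase space mentioned in the abstract.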

    Data and simulations about audiovisual asynchrony and predictability in speech perception

    Since a paper by Chandrasekaran et al. (2009), an increasing number of neuroscience papers have capitalized on the assumption that visual speech is typically 150 ms ahead of auditory speech. It turns out that the estimation of audiovisual asynchrony by Chandrasekaran et al. is valid only in very specific cases: for isolated CV syllables or at the beginning of a speech utterance. We present simple audiovisual data on plosive-vowel syllables (pa, ta, ka, ba, da, ga, ma, na) showing that audiovisual synchrony is actually rather precise when syllables are chained in sequences, as they typically are in most parts of a natural speech utterance. We then discuss how the natural coordination between sound and image (combining cases of lead and lag of the visual input) is reflected in the so-called temporal integration window for audiovisual speech perception (van Wassenhove et al., 2007). We conclude with a computational proposal about predictive coding in such sequences, showing that the visual input may actually provide and enhance predictions even when it is roughly synchronous with the auditory input.
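    As a rough illustration of how such an asynchrony can be quantified, one can cross-correlate an acoustic envelope with a lip-aperture signal and take the best-matching lag. The signal names and the toy Gaussian pulses below are illustrative assumptions, not the measurement procedure of the paper.

```python
import numpy as np

def estimate_av_lag(audio_env, lip_aperture, fs):
    """Return the lag (in ms) maximizing the cross-correlation.
    Positive lag = the visual signal leads the audio."""
    a = audio_env - audio_env.mean()
    v = lip_aperture - lip_aperture.mean()
    corr = np.correlate(a, v, mode="full")
    lags = np.arange(-len(v) + 1, len(a))   # numpy's full-mode lag axis
    return 1000.0 * lags[np.argmax(corr)] / fs

# Toy signals at 100 Hz: a lip-opening pulse, then an acoustic burst
# 15 samples (150 ms) later, mimicking visual lead on an isolated syllable.
fs = 100
t = np.arange(300)
v = np.exp(-0.5 * ((t - 150) / 10.0) ** 2)   # lip aperture
a = np.exp(-0.5 * ((t - 165) / 10.0) ** 2)   # acoustic envelope
print(estimate_av_lag(a, v, fs))             # 150.0 ms visual lead
```

    In chained syllable sequences, the abstract's point is that this estimated lag shrinks toward zero, unlike the isolated-syllable case sketched here.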

    Perceptuo-motor biases in the perceptual organization of the height feature in French vowels

    Forthcoming in Acta Acustica. This paper reports on the organization of the perceived vowel space in French. In a previous paper [28], we investigated the implementation of vowel height contrasts along the F1 dimension by French speakers. Here, we present results from perceptual identification tests performed by twelve participants who had taken part in the production experiment reported in the earlier paper. For each subject, the stimuli presented in the identification test were synthesized in two different vowel spaces, corresponding to two different vocal tract lengths. The results showed, first, that perceived French vowels belonging to the same height degree were aligned on stable F1 values, independent of place of articulation and roundedness, as was the case for produced vowels. Second, the produced F1 distances between height degrees correlated with the perceived F1 distances. This suggests a link between perceptual and motor phonemic prototypes in the human brain. The results are discussed within the framework of the Perception-for-Action-Control (PACT) theory, in which speech units are considered to be gestures shaped by perceptual processes.

    Disentangling unisensory from fusion effects in the attentional modulation of McGurk effects: a Bayesian modeling study suggests that fusion is attention-dependent

    The McGurk effect has been shown to be modulated by attention. However, it remains unclear whether attentional effects are due to changes in unisensory processing or in the fusion mechanism itself. In this paper, we used published experimental data showing that distraction of visual attention weakens the McGurk effect to fit either the Fuzzy Logical Model of Perception (FLMP), in which the fusion mechanism is fixed, or a variant of it in which the fusion mechanism can vary depending on attention. The latter model was associated with a larger likelihood when assessed with a Bayesian model selection criterion. Our findings suggest that distraction of visual attention affects fusion by decreasing the weight of the visual input.
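    The two models compared in the abstract can be sketched as follows. The FLMP combines unisensory evidence multiplicatively with a fixed fusion rule; the variant raises the visual evidence to an attention-dependent exponent w, a common way to implement a variable visual weight. The exponent parameterization and all numbers below are illustrative assumptions, not necessarily the paper's exact formulation.

```python
def flmp(audio, visual, w=1.0):
    """FLMP-style fusion. audio, visual: dicts of per-response support
    in [0, 1]; w is a visual attention weight (w=1 recovers plain FLMP)."""
    fused = {r: audio[r] * (visual[r] ** w) for r in audio}
    total = sum(fused.values())
    return {r: s / total for r, s in fused.items()}

# Audio "ba" dubbed on visual "ga" (illustrative support values):
audio  = {"ba": 0.7,  "da": 0.2,  "ga": 0.1}
visual = {"ba": 0.05, "da": 0.35, "ga": 0.6}

print(flmp(audio, visual, w=1.0))  # full attention: fused "da" dominates
print(flmp(audio, visual, w=0.2))  # distracted: response follows the audio "ba"
```

    Fitting w per attention condition versus fixing w = 1 is what the Bayesian model selection in the abstract arbitrates between.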

    Binding and unbinding the auditory and visual streams in the McGurk effect

    Subjects presented with coherent auditory and visual streams generally fuse them into a single percept. This results in enhanced intelligibility in noise, or in visual modification of the auditory percept in the McGurk effect. It is classically considered that processing is done independently in the auditory and visual systems before interaction occurs at a certain representational stage, resulting in an integrated percept. However, some behavioral and neurophysiological data suggest the existence of a two-stage process. A first stage would bind together the appropriate pieces of audio and video information before fusion per se in a second stage. It should then be possible to design experiments leading to unbinding. It is shown here that if a given McGurk stimulus is preceded by an incoherent audiovisual context, the amount of McGurk effect is largely reduced. Various kinds of incoherent contexts (acoustic syllables dubbed onto video sentences, or phonetic or temporal modifications of the acoustic content of a regular sequence of audiovisual syllables) can significantly reduce the McGurk effect even when they are short (less than 4 s). The data are interpreted in the framework of a two-stage "binding and fusion" model for audiovisual speech perception.
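    The two-stage idea can be sketched as a binding state driven by recent audiovisual coherence, whose value then serves as the visual weight at the fusion stage. The leaky update rule, the linear fusion, and all numbers below are illustrative assumptions, not the model of the paper.

```python
def update_binding(state, coherent, rate=0.5):
    """Stage 1: leaky update of the binding state toward 1 on coherent
    audiovisual input, toward 0 on incoherent input."""
    target = 1.0 if coherent else 0.0
    return state + rate * (target - state)

def fuse(audio_evidence, visual_evidence, binding_state):
    """Stage 2: fusion with a binding-dependent visual weight."""
    w = binding_state
    return (1.0 - w) * audio_evidence + w * visual_evidence

state = 1.0                              # fully bound at the start
for coherent in [False, False, False]:   # a short incoherent context
    state = update_binding(state, coherent)

print(state)                  # binding state decays: 1.0 -> 0.5 -> 0.25 -> 0.125
print(fuse(0.2, 0.9, state))  # fused evidence now dominated by the audio input
```

    On this sketch, a few incoherent context syllables are enough to unbind the streams, matching the observation that contexts under 4 s already reduce the McGurk effect.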

    A Bayesian framework for speech motor control

    The remarkable capacity of the speech motor system to adapt to various speech conditions is due to an excess of degrees of freedom, which enables similar acoustic properties to be produced with different sets of control strategies. To explain how the central nervous system selects one of the possible strategies, a common approach, in line with optimal motor control theories, is to model speech motor planning as the solution of an optimality problem based on cost functions. Despite the success of this approach, one of its drawbacks is the intrinsic contradiction between the concept of optimality and the experimentally observed intra-speaker token-to-token variability. The present paper proposes an alternative approach by formulating feedforward optimal control in a probabilistic Bayesian modeling framework. This is illustrated by controlling a biomechanical model of the vocal tract for speech production and by comparing it with an existing optimal control model (GEPPETO). The essential elements of this optimal control model are presented first, and the Bayesian model is then constructed from them in a progressive way. The performance of the Bayesian model is evaluated through computer simulations and compared to that of the optimal control model. The approach is shown to be appropriate for solving the speech planning problem while accounting for variability in a principled way.
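    The contrast between a single optimum and posterior sampling can be sketched on a toy one-dimensional problem. Everything below (the forward model `acoustic`, the Gaussian likelihood and prior, the Metropolis sampler) is an illustrative assumption, not the GEPPETO model or the paper's Bayesian formulation.

```python
import math
import random

random.seed(0)

def acoustic(m):
    """Toy forward model: motor command -> acoustic (formant-like) value."""
    return 500.0 + 300.0 * m

def log_posterior(m, target, sigma_acoustic=20.0, sigma_prior=1.0):
    """log P(m | target) up to a constant: acoustic match + effort prior."""
    like = -0.5 * ((acoustic(m) - target) / sigma_acoustic) ** 2
    prior = -0.5 * (m / sigma_prior) ** 2   # prefer commands near rest
    return like + prior

def plan(target, n_steps=2000, step=0.1):
    """Metropolis sampling: each call returns one plausible command,
    so repeated calls give token-to-token variability by construction."""
    m = 0.0
    for _ in range(n_steps):
        cand = m + random.gauss(0.0, step)
        if math.log(random.random()) < log_posterior(cand, target) - log_posterior(m, target):
            m = cand
    return m

tokens = [plan(800.0) for _ in range(5)]
print(tokens)   # five slightly different commands for the same target
```

    A cost-function planner would return the single argmax of the same log-posterior every time; sampling from it instead is what reconciles planning with intra-speaker variability in this sketch.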

    Modulating fusion in the McGurk effect by binding processes and contextual noise

    In a series of experiments, we showed that the McGurk effect may be modulated by context: applying incoherent auditory and visual material before an audiovisual target made of an audio "ba" and a video "ga" significantly decreases the McGurk effect. We interpreted this as evidence for an audiovisual "binding" stage controlling the fusion process: incoherence would produce "unbinding" and decrease the weight of the visual input in the fusion process. In this study, we further explore this binding stage through two experiments. First, we test the "rebinding" process by presenting a short period of either coherent material or silence after the incoherent "unbinding" context. We show that coherence provides "rebinding", resulting in a recovery of the McGurk effect. In contrast, silence provides no rebinding and hence "freezes" the unbinding process, resulting in no recovery of the McGurk effect. Capitalizing on this result, in a second experiment including an incoherent unbinding context followed by a coherent rebinding context before the target, we added noise over the whole contextual period, though not over the McGurk target. It appears that noise uniformly increases the rate of McGurk responses compared with the silent condition. This suggests that contextual noise increases the weight of the visual input in fusion, even when there is no noise within the target stimulus where fusion is applied. We conclude on the role of audiovisual coherence and noise in the binding process, within the framework of audiovisual speech scene analysis and the cocktail party effect.

    Sensory-motor interactions in speech perception, production and imitation: behavioral evidence from close shadowing, perceptuo-motor phonemic organization and imitative changes.

    Speech communication can be viewed as an interactive process involving a functional coupling between sensory and motor systems. In the present study, we combined three classical experimental paradigms to further test perceptuo-motor interactions in both speech perception and production. In a first, close-shadowing experiment, auditory and audiovisual syllable identification led to faster oral than manual responses. In a second experiment, participants were asked to produce and to listen to French vowels varying in the height feature, in order to test perceptuo-motor phonemic organization and idiosyncrasies. In a third experiment, online imitative changes in fundamental frequency in relation to acoustic vowel targets were observed in a non-interactive communication situation, during both unintentional and voluntary imitative production tasks. Altogether, our results are fully in line with a functional coupling between the action and perception speech systems and provide further evidence for the sensory-motor nature of speech representations.

    Effect of context, rebinding and noise, on audiovisual speech fusion

    In a previous set of experiments, we showed that audiovisual fusion in the McGurk effect may be modulated by context: a short context (2 to 4 syllables) composed of incoherent auditory and visual material significantly decreases the McGurk effect. We interpreted this as evidence for an audiovisual "binding" stage controlling the fusion process, and we also showed the existence of a "rebinding" process when incoherent material is followed by short coherent material. In this work, we evaluate the role of acoustic noise superimposed on the context and on the rebinding material. We use either a coherent or an incoherent context, followed, if incoherent, by a variable amount of coherent "rebinding" material, in two conditions: either silent or with superimposed speech-shaped noise. The McGurk target is presented without acoustic noise. We confirm the existence of unbinding (a weaker McGurk effect with an incoherent context) and of rebinding (the McGurk effect is recovered with coherent rebinding). Noise uniformly increases the rate of McGurk responses compared with the silent condition. We conclude on the role of audiovisual coherence and noise in the binding process, within the framework of audiovisual speech scene analysis and the cocktail party effect.